🎯 Reinforcement Learning
RLHF, Reward Models, Policy Optimization, AlphaGo
Reward Modeling for Reinforcement Learning-Based LLM Reasoning: Design, Challenges, and Evaluation
arxiv.org · 1d · 🤖 LLM
Risk-sensitive reinforcement learning using expectiles, shortfall risk and optimized certainty equivalent risk
arxiv.org · 1d · 🤖 AI
Check out this article on Reinforcement Learning with R: Origins, Real-Life Applications, and Practical Implementation
dev.to · 2d · Discuss: DEV · 🤖 LLM
A multi-agent reinforcement learning approach to autonomous aircraft taxiing with taxiing time, fuel consumption, and emission optimization
sciencedirect.com · 22h · 🤖 AI
Show HN: Fighting the War Against Expensive Reinforcement Learning
cadenza-landing-qtu7gbjwb-akshparekh123-3457s-projects.vercel.app · 4h · Discuss: Hacker News · 🤖 LLM
Recursive self-improvement from AI models
marginalrevolution.com · 1d · Discuss: Hacker News · 🎨 Multimodal AI
A training principle for drifting models
breno.bearblog.dev · 45m · 🤖 LLM
ashworks1706/rlhf-from-scratch: A theoretical and practical deep dive into Reinforcement Learning with Human Feedback and its applications in Large Language Models from scratch.
github.com · 2d · Discuss: Hacker News · 🤖 LLM
JupyterPS/VBAF: Visual Business Automation Framework - PowerShell-based reinforcement learning for education and business automation
github.com · 1d · Discuss: Hacker News · 🤖 LLM
Observe emergent behavior in autonomous multi-agent LLM networks
agents.glide2.app · 1d · Discuss: Hacker News · 🤖 LLM
Researchers propose a self-distillation fix for ‘catastrophic forgetting’ in LLMs
infoworld.com · 1h · 🤖 LLM
Backtracking Algorithms
algos.khourani.com · 1d · 🤖 LLM
Robotics Motion Learning: Training Linked Robot Arms with Kuramoto Models
hackernoon.com · 20h · 🎨 Multimodal AI
Show HN: A minimal online decision maker
decisionmaker.online · 22h · Discuss: Hacker News · 🤖 AI
Generalized Lanczos method for systematic optimization of neural-network quantum states
link.aps.org · 1h · 🤖 LLM
Architectural and Mathematical Foundations of Machine Learning: A Rigorous Synthesis of Theory, Geometry, and Implementation
chizkidd.github.io · 22h · Discuss: Hacker News · 🤖 AI
Behavioral economics-oriented energy storage investment analysis: A holistic decision support model with advanced fuzzy techniques
sciencedirect.com · 19h · 🤖 LLM
Magic Tricks, Moats, and the Three-Body Problem of AI Networks
caseyaccidental.com · 20h · 🎨 Multimodal AI
Steps to set up the game
meapps.itch.io · 19h · 🤖 AI
Multi AI Agent Systems with crewAI
deeplearning.ai · 38m · 🤖 AI